Marathon: An open source software library for the analysis of Markov-Chain Monte Carlo algorithms
In this paper, we consider the Markov-Chain Monte Carlo (MCMC) approach for
random sampling of combinatorial objects. The running time of such an algorithm
depends on the total mixing time of the underlying Markov chain and is unknown
in general. For some Markov chains, upper bounds on this total mixing time
exist but are too large to be applicable in practice. We try to answer the
question of whether the total mixing time is close to its upper bounds or
whether there is a significant gap between them. In doing so, we present the
software library marathon, which is designed to support the analysis of
MCMC-based sampling algorithms. The main application of this library is to compute
properties of so-called state graphs which represent the structure of Markov
chains. We use marathon to investigate the quality of several bounding methods
on four well-known Markov chains for sampling perfect matchings and bipartite
graph realizations. In a set of experiments, we compute the total mixing time
and several of its bounds for a large number of input instances. We find that
the upper bound obtained by the well-known canonical path method is several
orders of magnitude larger than the total mixing time and deteriorates with
growing input size. In contrast, the spectral bound is found to be a precise
approximation of the total mixing time.
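The comparison between a total mixing time and its spectral bounds can be illustrated on a toy chain. The following is a minimal sketch in plain Python and does not use marathon's API: all function names are hypothetical, the two-state chain is an assumed example, and the bounds are the standard spectral bounds for reversible chains, which may differ in detail from the variants evaluated in the paper.

```python
import math

def total_variation(p, q):
    """Total variation distance between two distributions on the same states."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def step(dist, P):
    """One step of the chain: row vector `dist` times transition matrix `P`."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def total_mixing_time(P, pi, eps=0.25, t_max=10_000):
    """Smallest t with max over start states x of TV(P^t(x, .), pi) <= eps."""
    n = len(P)
    # One point distribution per start state.
    rows = [[1.0 if i == x else 0.0 for i in range(n)] for x in range(n)]
    for t in range(t_max + 1):
        if max(total_variation(r, pi) for r in rows) <= eps:
            return t
        rows = [step(r, P) for r in rows]
    raise ValueError("chain did not mix within t_max steps")

def upper_spectral_bound(lambda_max, pi_min, eps=0.25):
    """Standard upper bound for reversible chains: ln(1/(eps*pi_min)) / (1 - lambda_max)."""
    return math.log(1.0 / (eps * pi_min)) / (1.0 - lambda_max)

def lower_spectral_bound(lambda_max, eps=0.25):
    """Standard lower bound: lambda_max / (2 (1 - lambda_max)) * ln(1/(2 eps))."""
    return 0.5 * lambda_max / (1.0 - lambda_max) * math.log(1.0 / (2.0 * eps))

# Assumed toy example: two-state chain with flip probabilities a and b.
a, b = 0.1, 0.2
P = [[1 - a, a], [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]   # stationary distribution
lambda_max = abs(1 - a - b)       # second-largest eigenvalue modulus of this chain

t_mix = total_mixing_time(P, pi)
lower = lower_spectral_bound(lambda_max)
upper = upper_spectral_bound(lambda_max, min(pi))
print(t_mix, lower, upper)
```

For realistic state graphs the transition matrix is far too large for this dense approach, which is why a dedicated library is needed.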
Gerbil: A Fast and Memory-Efficient k-mer Counter with GPU-Support
A basic task in bioinformatics is the counting of k-mers in genome strings.
The k-mer counting problem is to build a histogram of all substrings of
length k in a given genome sequence. We present the open source k-mer
counting software Gerbil, which has been designed for the efficient counting of
k-mers for large k. Given the technology trend towards long reads of
next-generation sequencers, support for large k becomes increasingly
important. While existing k-mer counting tools suffer from excessive memory
consumption or degrading performance for large k, Gerbil is able to
efficiently support large k without much loss of performance. Our software
implements a two-disk approach. In the first step, DNA reads are loaded from
disk and distributed to temporary files that are stored on a working disk. In
the second step, the temporary files are read again, split into k-mers and
counted via a hash table approach. In addition, Gerbil can optionally use GPUs
to accelerate the counting step. For large k, we outperform state-of-the-art
open source k-mer counting tools for large genome data sets.
Comment: A short version of this paper will appear in the proceedings of WABI
201
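The k-mer counting problem itself can be stated in a few lines. The sketch below is a plain in-memory illustration in Python, not Gerbil's implementation: it omits the two-disk approach, the temporary files and the GPU step, it does not merge reverse complements, and the function name `count_kmers` is hypothetical.

```python
from collections import Counter

def count_kmers(reads, k):
    """Build a histogram of all length-k substrings over a collection of reads."""
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    for read in reads:
        # Slide a window of width k over the read; reads shorter than k contribute nothing.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

histogram = count_kmers(["ACGTACGT"], 3)
print(histogram)
```

A hash table keyed by k-mer, as used here through `Counter`, needs memory proportional to the number of distinct k-mers, which is the cost that disk-based designs aim to keep under control.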
Timing of Train Disposition: Towards Early Passenger Rerouting in Case of Delays
Passenger-friendly train disposition is a challenging, highly complex online optimization problem with uncertain and incomplete information about future delays. In this paper, we focus on the timing within the disposition process. We introduce three different classification schemes to predict as early as possible the status of a transfer: whether it will almost surely break, is so critically delayed that it requires manual disposition, or can be regarded as only slightly uncertain or as being safe. The three approaches use lower bounds on travel times, historical distributions of delay data, and fuzzy logic, respectively. In experiments with real delay data, we achieve an excellent classification rate. Furthermore, using realistic passenger flows, we observe that there is significant potential to reduce passenger delay if an early rerouting strategy is applied.
Increased bioavailability of phenolic acids and enhanced vascular function following intake of feruloyl esterase-processed high fibre bread: a randomized, controlled, single blind, crossover human intervention trial
Background & aims
Clinical trial data have indicated an association between wholegrain consumption and a reduction in surrogate markers of cardiovascular disease. Phenolics present in wholegrain bound to arabinoxylan fibre may contribute to these effects, particularly when released enzymatically from the fibre prior to ingestion. The aim of the present study was therefore to determine whether the intake of high fibre bread containing higher free ferulic acid (FA) levels (enzymatically released during processing) enhances human endothelium-dependent vascular function.
Methods
A randomized, single masked, controlled, crossover, human intervention study was conducted on 19 healthy men. Individuals consumed either a high fibre flatbread with enzymatically released free FA (14.22 mg), an equivalent standard high fibre bread (2.34 mg), or a white bread control (0.48 mg), and markers of vascular function and plasma phenolic acid concentrations were measured at baseline and at 2, 5 and 7 h post consumption.
Results
Significantly increased brachial arterial dilation was observed following consumption of the high free FA (‘enzyme-treated’) high fibre bread versus both a white bread (2 h: p 0.05).
Conclusion
Dietary intake of bread processed enzymatically to release FA from arabinoxylan fibre during production increases the bioavailability of FA and induces acute endothelium-dependent vasodilation.
Clinical trial registry: No. NCT03946293.
Website: www.clinicaltrials.gov
The quality of the upper bounds for rapidly mixing instances.
<p>The results of <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0147935#pone.0147935.g003" target="_blank">Fig 3</a>, filtered to highlight instances with known polynomial mixing time. Instances with no known polynomial bound are coloured gray.</p>
Influence of the average vertex degree.
<p>Connection between the average vertex degree of a state graph and its total mixing time and canonical path bound.</p>
Single and double precision performance of the total mixing time computation.
<p>The charts show the running time of the total mixing time computation for five example state graphs of size 8012 to 20358. Due to the relatively small amount of GPU memory on our test system, only the first four state graphs could be processed by the GPU implementation in single precision mode, and only the first two in double precision mode. The running times were measured on an Ubuntu 14.04 system with an Intel Xeon E3-1231, an NVIDIA GeForce GTX 970 (4 GB GPU memory) and 16 GB of main memory, using <i>gcc</i> in version 4.8.4 and <i>CUDA</i> in version 7.0.</p>
Relationship between the lower spectral bound and the total mixing time.
<p>The total mixing time is shown in connection with a corresponding lower spectral bound for sequence pairs of the form (<i>n</i> − 1, <i>n</i> − 2, 2, 1), (2, 2, …, 2). We use the displayed formulas to predict missing values for the total mixing time.</p>